FUSE: Lightweight Guaranteed Distributed Failure Notification
Authors
Abstract
FUSE is a lightweight failure notification service for building distributed systems. Distributed systems built with FUSE are guaranteed that failure notifications never fail. Whenever a failure notification is triggered, all live members of the FUSE group will hear a notification within a bounded period of time, irrespective of node or communication failures. In contrast to previous work on failure detection, the responsibility for deciding that a failure has occurred is shared between the FUSE service and the distributed application. This allows applications to implement their own definitions of failure. Our experience building a scalable distributed event delivery system on an overlay network has convinced us of the usefulness of this service. Our results demonstrate that the network costs of each FUSE group can be small; in particular, our overlay network implementation requires no additional liveness-verifying ping traffic beyond that already needed to maintain the overlay, making the steady state network load independent of the number of active
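The group semantics described in the abstract can be illustrated with a toy, single-process sketch. It is not the FUSE implementation or its API: the class `FuseGroupSketch` and the methods `register_handler`, `signal_failure`, and `node_crashed` are hypothetical names chosen for illustration, and the bounded-time delivery guarantee and network behavior are not modeled. The sketch only shows that either the service (on detecting an unreachable node) or the application (using its own definition of failure) can trigger a notification, and that each live member then hears it once.

```python
from typing import Callable, Dict, List, Set


class FuseGroupSketch:
    """Toy in-process model of one failure-notification group (hypothetical API)."""

    def __init__(self, members: Set[str]) -> None:
        self.members = set(members)
        self.live = set(members)          # members still able to receive a notification
        self.handlers: Dict[str, Callable[[str], None]] = {}
        self.failed = False               # a group is notified at most once

    def register_handler(self, member: str, handler: Callable[[str], None]) -> None:
        if member not in self.members:
            raise KeyError(f"{member!r} is not a member of this group")
        self.handlers[member] = handler

    def signal_failure(self, reason: str) -> List[str]:
        """Application-triggered path: declare failure under the application's own definition."""
        if self.failed:
            return []                     # already notified; nothing more to deliver
        self.failed = True
        notified = []
        for member in sorted(self.live):
            handler = self.handlers.get(member)
            if handler is not None:
                handler(reason)
                notified.append(member)
        return notified

    def node_crashed(self, member: str) -> List[str]:
        """Service-triggered path: an unreachable member causes the rest to be notified."""
        self.live.discard(member)
        return self.signal_failure(f"node {member} unreachable")


def demo():
    log = []
    group = FuseGroupSketch({"a", "b", "c"})
    for name in ("a", "b", "c"):
        group.register_handler(name, lambda reason, name=name: log.append((name, reason)))
    first = group.node_crashed("b")                         # service detects a crash
    second = group.signal_failure("application-defined timeout")  # later signal is a no-op
    return first, second, log
```

Calling `demo()` exercises both trigger paths: the crash of member `b` notifies the remaining live members, and the later application-level signal delivers nothing because the group has already been notified.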
Related resources
A Lightweight Object Manager for Group-Aware Applications
Groupware applications used in distributed office environments are characterized by several users accessing and manipulating a shared document pool in a collaborative context. Supporting such applications with relational database persistence and object technology requires two main features: (1) a sound and generic mapping from the object-oriented application schema to the relational data model, ...
GMD – Forschungszentrum Informationstechnik GmbH GMD Report
Groupware applications used in distributed office environments are characterized by several users accessing and manipulating a shared document pool in a collaborative context. Supporting such applications with relational database persistence and object technology requires two main features: (1) a sound and generic mapping from the object-oriented application schema to the relational data model, ...
Context-Linked Intelligent User Interfaces for Distributed Teams: An Astrophysics Case Study
There is a growing need for distributed teams to analyze complex and dynamic data streams and make critical decisions under time pressure. Although intelligent software capable of making decisions is becoming more and more prevalent, some highly ambiguous situations still demand the guidance of human experts. Via a case study, we discuss potential guidelines for the design of software tools to ...
A Multi-traffic Inter-cell Interference Coordination Scheme in Dense Cellular Networks
This paper proposes a novel semi-distributed and practical ICIC scheme based on the Almost Blank SubFrame (ABSF) approach specified by 3GPP. We define two mathematical programming problems for the cases of guaranteed and best-effort traffic, and use game theory to study the properties of the derived ICIC distributed schemes, which are compared in detail against unaffordable centralized schemes. ...
The Session Based Fault Tolerance Algorithm of Platform EGO Web Service Gateway
Although grid computing has adopted Web services technology to deal with platform heterogeneity and to enhance service and application interoperability, it is still a challenge to build web service applications with high reliability and availability to meet the requirements of grid communities. The paper discusses the design of Platform EGO WSG with high reliability. To support a huge user bas...